ABSTRACT

The regression model $Y = X\beta + e$, with $e \sim N(0, \sigma^2 I)$, has
been studied extensively; that is, the model in which the
errors are independent and identically distributed as
$N(0, \sigma^2)$.

In this thesis we study the model in which the sample
observations are correlated with a prescribed correlation
structure and show that many of the results available for
the independent case apply equally well to correlated
samples.

We shall find, however, that some results obtained here are
not the same as in the case where the errors are independent
and identically distributed as $N(0, \sigma^2)$.

I. INTRODUCTION


Regression analysis is designed for situations where a
variable is thought to be related to one or more other
measurements made, usually, on the same objects. A purpose
of the analysis is to use data (observed values of
the variables) to estimate the form of this relationship.

The problem we consider is that of estimating or predicting
the value of a dependent random variable y on the
basis of some known measurements of an independent controlled
variable x. Scientists, economists, psychologists, and
sociologists have always been concerned with the problems
of prediction. Meteorologists are constantly analyzing
data in hopes of predicting or forecasting with a high
degree of accuracy.

An example would be to use information on the weight and
height of people to estimate the extent to which a man's
weight is related to his height. If one man were picked at
random, we might expect his weight to be $a + bx$
(where $x$ is height), namely, $E(y) = a + bx$ (where $y$ is
weight). In gathering data, the weight of every man with
height $x$ will not be exactly $a + bx$. Therefore the
difference, $y_i - E(y_i)$, can be written as

$$e_i = y_i - E(y_i) = y_i - a - b x_i,$$

hence $y_i = a + b x_i + e_i$.

The man's age can also be considered to be a factor affecting
weight. Then the model is extended to be
$$E(y) = a + b_1 x_1 + b_2 x_2.$$
More generally, we can extend the equation to be
$$E(y) = a + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k.$$

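A frequently used notation is

$$y_i = b_0 x_{i0} + b_1 x_{i1} + \cdots + b_k x_{ik} + e_i, \qquad i = 1, 2, \ldots, n,$$

with $x_{i0} = 1$ for all $i$.

Now define the following matrices and vectors:

$$X = \begin{pmatrix} x_{10} & x_{11} & \cdots & x_{1k} \\ \vdots & \vdots & & \vdots \\ x_{n0} & x_{n1} & \cdots & x_{nk} \end{pmatrix}
   = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{pmatrix},$$

$$\beta = \begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_k \end{pmatrix}, \qquad
  Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \qquad
  e = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}.$$

Then we write

$$Y = X\beta + e, \qquad E(Y) = X\beta.$$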
The model $Y = X\beta + e$, with $e \sim N(0, \sigma^2 I)$, has been studied
extensively; that is, the model in which the errors are
independent and identically distributed as $N(0, \sigma^2)$.

In this thesis we study the model in which the sample
observations are correlated with a prescribed correlation
structure and show that many of the results available for
the independent case apply equally well to correlated
samples.


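In this thesis, we shall examine the model

$$Y = X\beta + e, \qquad \text{with } e \sim N(0, V),$$

where

$$V = \tfrac{1}{2}(A + A') + \lambda (I - E), \qquad
  A = \begin{pmatrix} a_1 & a_1 & \cdots & a_1 \\ a_2 & a_2 & \cdots & a_2 \\ \vdots & \vdots & & \vdots \\ a_n & a_n & \cdots & a_n \end{pmatrix},$$

$A'$ is the transpose of $A$, $a_i$ ($i = 1, 2, \ldots, n$) and $\lambda$ are
positive constants, $I$ is an $n \times n$ identity matrix, and $E$ is an
$n \times n$ matrix all of whose elements are unity. Thus the variance
of $y_i$ is $a_i$ and the covariance of $y_i$ and $y_j$ ($i \neq j$) is
$\tfrac{1}{2}(a_i + a_j) - \lambda$.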
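As a small illustration of this covariance structure, the following sketch builds $V$ for $n = 3$. It assumes the structure reads $V = \tfrac{1}{2}(A + A') + \lambda(I - E)$; the function name `build_v` and the numerical values of $a_i$ and $\lambda$ are hypothetical and chosen only for the example.

```python
from fractions import Fraction

def build_v(a, lam):
    """Return V = (1/2)(A + A') + lam * (I - E) as a list of rows.

    Row i of A is the constant a[i], so (A + A')[i][j] = a[i] + a[j].
    """
    n = len(a)
    return [
        [Fraction(a[i] + a[j], 2) + lam * ((1 if i == j else 0) - 1)
         for j in range(n)]
        for i in range(n)
    ]

a = [3, 4, 6]        # hypothetical positive constants a_1, a_2, a_3
lam = Fraction(2)    # hypothetical positive constant lambda
V = build_v(a, lam)

for row in V:
    print([str(v) for v in row])
```

The diagonal of the printed matrix reproduces the $a_i$, and each off-diagonal entry equals $\tfrac{1}{2}(a_i + a_j) - \lambda$. Exact fractions are used so that later identities can be checked without rounding error.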

We shall find that some results obtained here are not
the same as in the case where the errors are independent
and identically distributed as $N(0, \sigma^2)$.





II. SUMMARY OF KNOWN RESULTS


A. PROPERTIES OF $\hat{\beta}$ AND $\hat{\sigma}^2$ UNDER NORMAL THEORY


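Consider the model $Y = X\beta + e$, $e \sim N(0, \sigma^2 I)$, where the components
of $\beta$ and $\sigma^2$ are unknown, and $X$ is a known matrix of rank
$k + 1$.

Since the vector of errors $e$ is normally distributed,
the maximum-likelihood method will be used to estimate $\beta$
and $\sigma^2$.

Solving the likelihood equations (and correcting the estimate
of $\sigma^2$ for bias) we get

$$\hat{\beta} = S^{-1} X' Y, \qquad \text{where } S = X'X,$$

$$\hat{\sigma}^2 = \frac{(Y - X\hat{\beta})'(Y - X\hat{\beta})}{n - (k+1)}
              = \frac{Y'(I - X S^{-1} X')Y}{n - (k+1)}.$$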


$\hat{\beta}$ and $\hat{\sigma}^2$ have the following properties:

    consistent
    efficient
    unbiased
    sufficient
    complete
    minimum variance unbiased
    $\hat{\beta} \sim N(\beta, \sigma^2 S^{-1})$
    $(n-k-1)\hat{\sigma}^2 / \sigma^2 \sim \chi^2(n-k-1)$
    $\hat{\beta}$ and $\hat{\sigma}^2$ are independent.








B. SOME THEOREMS USED IN DERIVING THE RESULTS

Theorem 1. If $Y$ is distributed as $N(\mu, \sigma^2 I)$, then $Y'AY/\sigma^2$
is distributed as $\chi'^2(k, \lambda)$, where $\lambda = \mu'A\mu / 2\sigma^2$ and $k$ = rank
of $A$, if and only if $A$ is idempotent.

Theorem 2. If $Y$ is distributed as $N(\mu, V)$, then $Y'BY$ is
distributed as $\chi'^2(k, \lambda)$, where $\lambda = \tfrac{1}{2}\mu'B\mu$ and $k$ = rank of
$B$, if and only if $BV$ is idempotent.

Theorem 3. If $Y \sim N(\mu, V)$, then $Y'AY$ and $Y'BY$ are independent
if and only if $AVB = 0$.

Theorem 4. If $Y \sim N(\mu, V)$, then $C'Y$ and $Y'AY$ are independent
if and only if $C'VA = 0$.


Theorem 5. (Hogg and Craig theorem)
Let $Q = Q_1 + Q_2 + \cdots + Q_k$, where $Q, Q_1, \ldots, Q_k$ are $k + 1$
random variables that are quadratic forms in the observations
of a random sample of size $n$ from $N(\mu, \sigma^2)$. Let
$Q/\sigma^2$ be $\chi^2(r)$, let $Q_i/\sigma^2$ be $\chi^2(r_i)$, $i = 1, 2, \ldots, k-1$, and
let $Q_k$ be non-negative. Then the random variables $Q_1, Q_2,
\ldots, Q_k$ are mutually stochastically independent and, hence,
$Q_k/\sigma^2$ is $\chi^2(r_k)$, with $r_k = r - r_1 - \cdots - r_{k-1}$.


Theorem 6. (Baldessari theorem)
Let $Y$ be $N(\mu, V)$ and $B_1, B_2, \ldots, B_k$ be $(n \times n)$ idempotent
matrices satisfying

$$\sum_{j=1}^{k} B_j = I - \tfrac{1}{n}E,$$

where $I$ and $E$ are matrices as defined above.

Let $\lambda$ be a positive constant.

Then a necessary and sufficient condition for $\tfrac{1}{\lambda} Y'B_jY$,
$j = 1, 2, \ldots, k$, to be mutually independent and have
non-central chi-square distributions with $r_j$ ($r_j$ = rank of $B_j$,
$j = 1, 2, \ldots, k$) degrees of freedom is that the covariance
matrix $V$ has the following structure:

$$V = \tfrac{1}{2}(A + A') + \lambda (I - E).$$

N 


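III. SOME PROPERTIES OF $\hat{\beta}$ UNDER THE PRESENT MODEL


We consider the model

$$Y = X\beta + e, \qquad e \sim N(0, V), \qquad \text{where } V = \tfrac{1}{2}(A + A') + \lambda(I - E).$$

Let $\hat{\beta} = S^{-1}X'Y$ and $Y'CY = (Y - X\hat{\beta})'(Y - X\hat{\beta}) = Y'(I - XS^{-1}X')Y$.

Now we will show that

A. $E(\hat{\beta}) = \beta$,

B. $\tfrac{1}{\lambda} Y'CY \sim \chi^2(n-k-1)$,

C. $E(Y'CY) = \lambda(n-k-1)$, and

D. $\hat{\beta}$ and $Y'CY$ are not independent.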


A. EXPECTED VALUE OF $\hat{\beta}$

$$E(\hat{\beta}) = E(S^{-1}X'Y) = S^{-1}X'E(Y) = S^{-1}X'E(X\beta + e) = S^{-1}X'X\beta = \beta.$$

Hence $\hat{\beta} = S^{-1}X'Y$ is an unbiased estimate of $\beta$.


B. DISTRIBUTION OF $Y'CY/\lambda$

We have seen that $Y'(I - XS^{-1}X')Y/\sigma^2 \sim \chi^2(n-k-1)$ when
$e \sim N(0, \sigma^2 I)$. Therefore we can conjecture that
$Y'(I - XS^{-1}X')Y/\lambda$ is distributed as $\chi^2(n-k-1)$. The necessary
and sufficient condition for $\tfrac{1}{\lambda}Y'CY$ to be $\chi^2(n-k-1)$ is that
$\tfrac{1}{\lambda}CV$ is idempotent.

Now we want to show that $\tfrac{1}{\lambda}CV = \tfrac{1}{\lambda}(I - XS^{-1}X')V$ is idempotent,
that is, to show that

$$\tfrac{1}{\lambda}(I - XS^{-1}X')V(I - XS^{-1}X') = I - XS^{-1}X'.$$


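Note

$$\tfrac{1}{\lambda}(I - XS^{-1}X')V(I - XS^{-1}X')
  = \tfrac{1}{\lambda}(I - XS^{-1}X')\left(\tfrac{1}{2}(A + A') + \lambda(I - E)\right)(I - XS^{-1}X')
  = I - XS^{-1}X',$$

since $XS^{-1}X'E = E$, $XS^{-1}X'A' = A'$ and $AXS^{-1}X' = A$.

Hence, by Theorem 2, $\tfrac{1}{\lambda}Y'CY$ has a non-central chi-square
distribution with $q$ degrees of freedom, where

$$q = \operatorname{rank}(I - XS^{-1}X') = \operatorname{tr}(I - XS^{-1}X')
    = \operatorname{tr}(I) - \operatorname{tr}(X'XS^{-1}) = n - (k+1),$$

since $I - XS^{-1}X'$ is itself symmetric and idempotent,
and with non-centrality parameter

$$\tfrac{1}{2}\beta'X'\,\tfrac{1}{\lambda}(I - XS^{-1}X')X\beta = 0.$$

Therefore $\tfrac{1}{\lambda}Y'CY \sim \chi^2(n-k-1)$.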
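The identity used in this step can be checked exactly on a small case. The sketch below is only an illustration under stated assumptions: one regressor ($k = 1$), the structure $V = \tfrac{1}{2}(A + A') + \lambda(I - E)$, and hypothetical values for $x$, $a_i$ and $\lambda$. For $X = [1, x]$ it uses the fact that $XS^{-1}X' = E/n + x_c x_c'/(x_c'x_c)$, where $x_c$ is the centered regressor, so no matrix inversion is needed.

```python
from fractions import Fraction

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [[sum(p[i][t] * q[t][j] for t in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def build_v(a, lam):
    """V = (1/2)(A + A') + lam * (I - E); row i of A is the constant a[i]."""
    n = len(a)
    return [[Fraction(a[i] + a[j], 2) + lam * ((1 if i == j else 0) - 1)
             for j in range(n)] for i in range(n)]

def residual_projector(x):
    """C = I - X S^{-1} X' for X = [1, x], via X S^{-1} X' = E/n + xc xc'/(xc'xc)."""
    n = len(x)
    xbar = Fraction(sum(x), n)
    xc = [xi - xbar for xi in x]
    s1 = sum(v * v for v in xc)
    return [[(1 if i == j else 0) - Fraction(1, n) - xc[i] * xc[j] / s1
             for j in range(n)] for i in range(n)]

x = [1, 2, 4, 7]     # hypothetical regressor values
a = [3, 4, 6, 5]     # hypothetical a_i
lam = Fraction(2)    # hypothetical lambda

C = residual_projector(x)
V = build_v(a, lam)
CVC = matmul(matmul(C, V), C)

same = CVC == [[lam * c for c in row] for row in C]   # is C V C equal to lam * C?
trace = sum(C[i][i] for i in range(len(x)))           # trace of C = its rank
print(same, trace)
```

With exact fractions the comparison is an equality test rather than a tolerance check; the trace equals $n - k - 1 = 2$ in this example.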


C. EXPECTED VALUE OF $Y'CY$

From B it is obtained directly that $E(Y'(I - XS^{-1}X')Y) =
\lambda(n-k-1)$; also then $Y'CY/(n-k-1)$ is an unbiased estimate of $\lambda$.


D. $\hat{\beta} = S^{-1}X'Y$ AND $Y'CY$ ARE NOT INDEPENDENT

To show that they are not independent, let's show
$S^{-1}X'VC \neq 0$:

$$S^{-1}X'V(I - XS^{-1}X') = S^{-1}X'\left(\tfrac{1}{2}(A + A') + \lambda(I - E)\right)(I - XS^{-1}X')
  = \tfrac{1}{2}\left(S^{-1}X'A' - S^{-1}X'A'XS^{-1}X'\right) \neq 0.$$

We recognize that $\hat{\beta}$ and $Y'CY$ are not independent, which
is not the same result as obtained in the i.i.d. case.
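A numerical sketch may make this concrete. Under the same assumptions as before (one regressor, $V = \tfrac{1}{2}(A + A') + \lambda(I - E)$, hypothetical data), it evaluates $S^{-1}X'VC$ once with unequal $a_i$ and once with all $a_i$ equal. The helper names `fit_pieces`, `unequal` and `equal` are introduced here for illustration only.

```python
from fractions import Fraction

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [[sum(p[i][t] * q[t][j] for t in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def build_v(a, lam):
    """V = (1/2)(A + A') + lam * (I - E); row i of A is the constant a[i]."""
    n = len(a)
    return [[Fraction(a[i] + a[j], 2) + lam * ((1 if i == j else 0) - 1)
             for j in range(n)] for i in range(n)]

def fit_pieces(x):
    """For X = [1, x]: return (H, C) with H = S^{-1} X' (2 x n), C = I - X S^{-1} X'."""
    n = len(x)
    xbar = Fraction(sum(x), n)
    xc = [xi - xbar for xi in x]
    s1 = sum(v * v for v in xc)
    slope = [v / s1 for v in xc]                             # row giving the slope estimate
    intercept = [Fraction(1, n) - xbar * w for w in slope]   # row giving the intercept estimate
    C = [[(1 if i == j else 0) - Fraction(1, n) - xc[i] * xc[j] / s1
          for j in range(n)] for i in range(n)]
    return [intercept, slope], C

x = [1, 2, 4, 7]     # hypothetical regressor values
lam = Fraction(2)    # hypothetical lambda
H, C = fit_pieces(x)

unequal = matmul(matmul(H, build_v([3, 4, 6, 5], lam)), C)   # a_i differ
equal = matmul(matmul(H, build_v([4, 4, 4, 4], lam)), C)     # a_i all equal

print([str(v) for v in unequal[0]])   # intercept row
print([str(v) for v in unequal[1]])   # slope row
```

In this example only the intercept row is non-zero, and it vanishes as well when the $a_i$ are all equal. This is consistent with treating the coefficients other than $b_0$ separately.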


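For reference, the covariance matrix of $\hat{\beta}$ is
provided as follows:

$$V(\hat{\beta}) = \begin{pmatrix}
\bar{a} - \lambda\dfrac{n-1}{n} + (n\bar{a}\bar{x}' - a'X_1 + \lambda\bar{x}')S_1^{-1}\bar{x}
  & (\tfrac{1}{2}a'X_1 - \tfrac{1}{2}n\bar{a}\bar{x}' - \lambda\bar{x}')S_1^{-1} \\
S_1^{-1}(\tfrac{1}{2}X_1'a - \tfrac{1}{2}n\bar{a}\bar{x} - \lambda\bar{x})
  & \lambda S_1^{-1}
\end{pmatrix},$$

where $\bar{a} = \tfrac{1}{n}\sum_{i=1}^{n} a_i$, $a' = (a_1, a_2, \ldots, a_n)$,
$\bar{x}' = (\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k)$ with $\bar{x}_j = \tfrac{1}{n}\sum_{i=1}^{n} x_{ij}$, $j = 1, 2, \ldots, k$,

$$X_1 = \begin{pmatrix} x_{11} & \cdots & x_{1k} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{nk} \end{pmatrix},$$

and

$$S_1 = X_1'X_1 - n\bar{x}\bar{x}'.$$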


Note that $\lambda$ factors from the lower right-hand corner,
but not from the rest of the matrix. This causes trouble
because the unknown parameter $\lambda$ does not cancel out in
forming the usual Student t statistic.

Therefore we develop the new modified model without
$b_0$ in the next chapter.


IV. SOME PROPERTIES OF $\hat{\beta}_1$ AND $Y^{*\prime}(I - X^*S^{*-1}X^{*\prime})Y^*$
UNDER THE NEW MODEL


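We consider the model

$$Y^* = X^*\beta_1 + e^*,$$

where

$$Y^* = Y - \bar{y}\mathbf{1} = (I - \tfrac{1}{n}E)Y,$$
$$X^* = X_1 - \mathbf{1}\bar{x}' = (I - \tfrac{1}{n}E)X_1,$$
$$e^* = e - \bar{e}\mathbf{1} = (I - \tfrac{1}{n}E)e,$$

$\beta_1 = (b_1, b_2, \ldots, b_k)'$, and $\mathbf{1}$ is the $n \times 1$ vector of ones.

Note then that $Y^{*\prime} = (y_1 - \bar{y}, \ldots, y_n - \bar{y})$ and thus

$$\sum_{i=1}^{n} y_i^* = 0, \qquad \sum_{i=1}^{n} x_{ij}^* = 0,$$

that is, $\mathbf{1}'Y^* = 0$, $\mathbf{1}'X^* = 0$.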


Then

$$V(e^*) = \lambda(I - \tfrac{1}{n}E),$$

$$e^* \sim N(0, \lambda(I - \tfrac{1}{n}E)),$$

and $Y^* \sim N(X^*\beta_1, \lambda(I - \tfrac{1}{n}E))$
under our original assumptions.
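The effect of centering on the covariance matrix can be verified directly. The following sketch assumes $V = \tfrac{1}{2}(A + A') + \lambda(I - E)$ and uses hypothetical values of $a_i$ and $\lambda$; it checks that $(I - \tfrac{1}{n}E)V(I - \tfrac{1}{n}E)$ no longer depends on the $a_i$.

```python
from fractions import Fraction

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [[sum(p[i][t] * q[t][j] for t in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def build_v(a, lam):
    """V = (1/2)(A + A') + lam * (I - E); row i of A is the constant a[i]."""
    n = len(a)
    return [[Fraction(a[i] + a[j], 2) + lam * ((1 if i == j else 0) - 1)
             for j in range(n)] for i in range(n)]

a = [3, 4, 6, 5]     # hypothetical a_i
lam = Fraction(2)    # hypothetical lambda
n = len(a)

# Centering matrix M = I - E/n, so that e* = M e
M = [[(1 if i == j else 0) - Fraction(1, n) for j in range(n)] for i in range(n)]

V_star = matmul(matmul(M, build_v(a, lam)), M)          # covariance of e*
target = [[lam * m for m in row] for row in M]          # lam * (I - E/n)
print(V_star == target)
```

Replacing `a` by any other list of the same length leaves `V_star` unchanged, which is the reason the centered model is convenient.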


Let $\hat{\beta}_1 = S^{*-1}X^{*\prime}Y^*$, where $S^* = X^{*\prime}X^*$, and

$$Y^{*\prime}C^*Y^* = Y^{*\prime}(I - X^*S^{*-1}X^{*\prime})Y^* = (Y^* - X^*\hat{\beta}_1)'(Y^* - X^*\hat{\beta}_1).$$


We are going to show

A. The expected value of $\hat{\beta}_1$,

B. The distribution of $\tfrac{1}{\lambda}Y^{*\prime}C^*Y^*$,

C. The expected value of $Y^{*\prime}C^*Y^*$,

D. That $\hat{\beta}_1$ and $Y^{*\prime}C^*Y^*$ are independent.





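A. EXPECTED VALUE OF $\hat{\beta}_1$

$$E(\hat{\beta}_1) = E(S^{*-1}X^{*\prime}Y^*) = S^{*-1}X^{*\prime}E(Y^*) = S^{*-1}X^{*\prime}X^*\beta_1 = \beta_1.$$

Hence $\hat{\beta}_1 = S^{*-1}X^{*\prime}Y^*$ is an unbiased estimate of $\beta_1$. It is
easily shown that $\hat{\beta}_1$ is the same as $\hat{\beta}_1$ in
$\hat{\beta} = \begin{pmatrix} \hat{b}_0 \\ \hat{\beta}_1 \end{pmatrix}$.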
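B. DISTRIBUTION OF $(1/\lambda)(Y^{*\prime}C^*Y^*)$

Since $\tfrac{1}{\lambda}C^*\cdot\lambda(I - \tfrac{1}{n}E) = I - X^*S^{*-1}X^{*\prime} - \tfrac{1}{n}E$ is idempotent,
$(1/\lambda)Y^{*\prime}C^*Y^*$ has a non-central chi-square distribution by Th. 2.
Because the covariance matrix of $Y^*$ is singular, the degrees of
freedom are given by the rank of this idempotent product rather
than by the rank of $C^*$ alone:

$$q = \operatorname{rank}\left(I - X^*S^{*-1}X^{*\prime} - \tfrac{1}{n}E\right)
    = \operatorname{tr}\left(I - X^*S^{*-1}X^{*\prime} - \tfrac{1}{n}E\right) = n - k - 1,$$

and the non-centrality parameter is

$$\tfrac{1}{2}\beta_1'X^{*\prime}\,\tfrac{1}{\lambda}(I - X^*S^{*-1}X^{*\prime})X^*\beta_1 = 0.$$

Hence $(1/\lambda)Y^{*\prime}C^*Y^* \sim \chi^2(n-k-1)$.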
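The degrees of freedom in the centered model can be checked by computing traces. The sketch below assumes one regressor ($k = 1$), so that $S^* = X^{*\prime}X^*$ is a scalar, and takes $V(e^*) = \lambda(I - \tfrac{1}{n}E)$; the regressor values are hypothetical.

```python
from fractions import Fraction

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [[sum(p[i][t] * q[t][j] for t in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

x = [1, 2, 4, 7]     # hypothetical regressor values, k = 1
n = len(x)
k = 1

xbar = Fraction(sum(x), n)
xs = [xi - xbar for xi in x]          # the single column of X*
s_star = sum(v * v for v in xs)       # S* = X*'X* (a scalar when k = 1)

# C* = I - X* S*^{-1} X*'  and  M = I - E/n = V(e*) / lambda
C_star = [[(1 if i == j else 0) - xs[i] * xs[j] / s_star for j in range(n)]
          for i in range(n)]
M = [[(1 if i == j else 0) - Fraction(1, n) for j in range(n)] for i in range(n)]

G = matmul(C_star, M)                 # (1/lambda) * C* V(e*)
idempotent = matmul(G, G) == G
trace_G = sum(G[i][i] for i in range(n))        # degrees of freedom
trace_C = sum(C_star[i][i] for i in range(n))   # rank of C* alone
print(idempotent, trace_G, trace_C)
```

In this example the trace of the product is $n - k - 1 = 2$, one less than the rank of $C^*$, which is $n - k = 3$; the difference comes from the constraint $\mathbf{1}'Y^* = 0$.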


C. EXPECTED VALUE OF $Y^{*\prime}C^*Y^*$

It is obtained directly from B that $E(Y^{*\prime}C^*Y^*) = \lambda(n-k-1)$.
Also it is easily seen that $Y^{*\prime}C^*Y^*/(n-k-1)$ is an unbiased estimate
of $\lambda$.

D. $\hat{\beta}_1 = S^{*-1}X^{*\prime}Y^*$ AND $Y^{*\prime}C^*Y^*$ ARE INDEPENDENT

$$S^{*-1}X^{*\prime}\,\lambda(I - \tfrac{1}{n}E)(I - X^*S^{*-1}X^{*\prime})
  = \lambda\left(S^{*-1}X^{*\prime} - \tfrac{1}{n}S^{*-1}X^{*\prime}E\right)(I - X^*S^{*-1}X^{*\prime}) = 0$$

(note $X^{*\prime}E = 0$ and $X^{*\prime}(I - X^*S^{*-1}X^{*\prime}) = 0$).

Therefore, by Th. 4, they are independent.










V. INTERVAL ESTIMATION AND PREDICTION INTERVAL 


The construction of interval estimators and a prediction 
interval to contain a single future observation in the i.i.d. 
case with a regression model has been discussed in many 
standard texts. In this chapter we extend these results 
to the case where the samples are correlated with a special 
type of covariance structure. We shall discuss 


A. Covariance of $\hat{\beta}_1$, $V(\hat{\beta}_1)$

B. Interval estimation of $r'\beta_1$

C. Prediction interval for one future observation.


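A. COVARIANCE OF $\hat{\beta}_1$, $V(\hat{\beta}_1)$

Since $\hat{\beta}_1$ is equal to the product of a constant matrix
$S^{*-1}X^{*\prime}$ and a normally distributed vector $Y^*$, $\hat{\beta}_1$ has the
k-variate normal distribution.

We have already shown that the mean of $\hat{\beta}_1$ is $\beta_1$.
The covariance matrix of $\hat{\beta}_1$ is

$$\begin{aligned}
V(\hat{\beta}_1) &= E\left((\hat{\beta}_1 - \beta_1)(\hat{\beta}_1 - \beta_1)'\right) \\
 &= E\left((S^{*-1}X^{*\prime}Y^* - \beta_1)(S^{*-1}X^{*\prime}Y^* - \beta_1)'\right) \\
 &= E\left((S^{*-1}X^{*\prime}(X^*\beta_1 + e^*) - \beta_1)(S^{*-1}X^{*\prime}(X^*\beta_1 + e^*) - \beta_1)'\right) \\
 &= S^{*-1}X^{*\prime}E(e^*e^{*\prime})X^*S^{*-1} \\
 &= S^{*-1}X^{*\prime}\,\lambda(I - \tfrac{1}{n}E)\,X^*S^{*-1} = \lambda S^{*-1}.
\end{aligned}$$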

So $\hat{\beta}_1$ is distributed $N(\beta_1, \lambda S^{*-1})$.

B. INTERVAL ESTIMATION OF $r'\beta_1$

Frequently, an experimenter is interested in setting
confidence limits on some function of the $b_i$. A method is
available for one of the most frequently occurring cases,
i.e., for the case of a linear function of the $b_i$.

We, here, discuss the case of a linear function of the $b_i$
(except $b_0$). Let $r$ be a known $k \times 1$ vector of constants;
then, to set confidence limits of size $1 - \alpha$ on $r'\beta_1$, we
proceed as follows.


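Since $r'\hat{\beta}_1$ is distributed $N(r'\beta_1, \lambda r'S^{*-1}r)$, it is
clear that

$$\frac{r'\hat{\beta}_1 - r'\beta_1}{(\lambda r'S^{*-1}r)^{1/2}}$$

is distributed $N(0, 1)$. We also know that
$Y^{*\prime}C^*Y^*/\lambda \sim \chi^2(n-k-1)$ and is independent of $\hat{\beta}_1$.

Therefore,

$$\frac{r'\hat{\beta}_1 - r'\beta_1}{(r'S^{*-1}r)^{1/2}}\left(\frac{n-k-1}{Y^{*\prime}C^*Y^*}\right)^{1/2} \sim t(n-k-1).$$

We arrive at the probability equation

$$P\left(r'\hat{\beta}_1 - t_{\alpha/2}\,c \le r'\beta_1 \le r'\hat{\beta}_1 + t_{\alpha/2}\,c\right) = 1 - \alpha,$$

where $c = \left(r'S^{*-1}r \cdot Y^{*\prime}C^*Y^*/(n-k-1)\right)^{1/2}$ and $t_{\alpha/2}$ is the
upper $\alpha/2$ point of the Student t distribution with $n-k-1$
degrees of freedom.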
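The interval can be computed from the centered data alone. The sketch below is a minimal illustration for one regressor ($k = 1$) and $r = 1$, so that $r'\beta_1$ is the slope. The function name `slope_interval`, the data, and the tabled critical value `T_CRIT` are all hypothetical inputs; the residual sum of squares is divided by $n - k - 1$, the trace of $I - X^*S^{*-1}X^{*\prime} - \tfrac{1}{n}E$.

```python
import math

def slope_interval(x, y, t_crit):
    """Return (lower, estimate, upper) for the slope, using centered data.

    t_crit is the upper alpha/2 point of Student t with n - k - 1 degrees
    of freedom; it must be supplied by the caller (e.g. from a table).
    """
    n = len(x)
    k = 1
    xbar = sum(x) / n
    ybar = sum(y) / n
    xs = [xi - xbar for xi in x]                  # X* (single column)
    ys = [yi - ybar for yi in y]                  # Y*
    s_star = sum(v * v for v in xs)               # S* = X*'X*
    b1 = sum(u * v for u, v in zip(xs, ys)) / s_star
    rss = sum((v - b1 * u) ** 2 for u, v in zip(xs, ys))   # Y*'C*Y*
    c = math.sqrt((1 / s_star) * rss / (n - k - 1))        # r'S*^{-1}r = 1/S* for r = 1
    return b1 - t_crit * c, b1, b1 + t_crit * c

x = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical data
y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT = 3.182                     # tabled upper 2.5% point of t with 3 degrees of freedom

low, est, high = slope_interval(x, y, T_CRIT)
print(round(low, 3), round(est, 3), round(high, 3))
```

Neither $\lambda$ nor the $a_i$ appear in the computation, which is the practical point of working with the centered model.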

C. PREDICTION INTERVAL FOR ONE FUTURE OBSERVATION

Hahn [4] has derived prediction intervals to contain
some future observations of i.i.d. random variables from a
normal distribution for a regression model. In this section
we extend this result to the case where the samples are
correlated and have a special type of covariance structure,
in order to construct a prediction interval to contain a
single future observation.





Let $y_{n+1}$ be the one future sample.

Define $Y$ to be the $(n+1)$-component vector including the
original sample of size $n$ and $y_{n+1}$. We assume $Y \sim N(X\beta, V)$,
where $X$ is $(n+1) \times (k+1)$, of rank $k+1$, and $V$ is as defined
earlier and is $(n+1) \times (n+1)$.


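Define $y^*_{n+1}$ as follows:

$$y^*_{n+1} = y_{n+1} - \bar{y}_{n+1}, \qquad \text{where } \bar{y}_{n+1} = \frac{1}{n+1}\sum_{i=1}^{n+1} y_i.$$

Then

$$y^*_{n+1} = \frac{n}{n+1}\,(y_{n+1} - \bar{y}_n), \qquad \text{where } \bar{y}_n = \frac{1}{n}\sum_{i=1}^{n} y_i,$$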


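and $y^*_{n+1}$ can be expressed as

$$y^*_{n+1} = \frac{n}{n+1}\left(\sum_{j=1}^{k} b_j (x_{n+1,j} - \bar{x}_j) + e_{n+1} - \bar{e}_n\right)
           = \frac{n}{n+1}\left(\beta_1' x^*_{n+1} + e_{n+1} - \bar{e}_n\right),$$

where $x^{*\prime}_{n+1} = (x_{n+1,1} - \bar{x}_1, \ldots, x_{n+1,k} - \bar{x}_k)$, $\bar{e}_n = \tfrac{1}{n}\sum_{i=1}^{n} e_i$, and
the means $\bar{x}_j$ are taken over the $n$ given observations.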


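We also, therefore, know $E(y^*_{n+1}) = \frac{n}{n+1}\beta_1' x^*_{n+1}$, and
$V(y^*_{n+1}) = \frac{n}{n+1}\lambda$, since

$$\begin{aligned}
V(y^*_{n+1}) &= \left(\frac{n}{n+1}\right)^2\left(V(y_{n+1}) + V(\bar{y}_n) - 2\,\mathrm{Cov}(y_{n+1}, \bar{y}_n)\right) \\
 &= \left(\frac{n}{n+1}\right)^2\left(a_{n+1} + \bar{a} - \frac{n-1}{n}\lambda - 2\left(\tfrac{1}{2}(\bar{a} + a_{n+1}) - \lambda\right)\right) \\
 &= \lambda\,\frac{n}{n+1}.
\end{aligned}$$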


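Next, let $\hat{y}^*_{n+1} = \frac{n}{n+1}\hat{\beta}_1' x^*_{n+1}$;

then $E(\hat{y}^*_{n+1}) = \frac{n}{n+1}\beta_1' x^*_{n+1}$ and

$$V(\hat{y}^*_{n+1}) = \lambda\left(\frac{n}{n+1}\right)^2 x^{*\prime}_{n+1} S^{*-1} x^*_{n+1}.$$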





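Now let $z = y^*_{n+1} - \hat{y}^*_{n+1}$;

then $E(z) = 0$ and

$$V(z) = \lambda\,\frac{n}{n+1}\left(1 + \frac{n}{n+1}\,x^{*\prime}_{n+1} S^{*-1} x^*_{n+1}\right).$$

Therefore, $z \sim N\left(0,\ \lambda\,\frac{n}{n+1}\left(1 + \frac{n}{n+1}\,x^{*\prime}_{n+1} S^{*-1} x^*_{n+1}\right)\right)$.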


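The standardized variate $z' = z/(V(z))^{1/2}$ is normally
distributed with mean 0 and variance 1. Furthermore,
since $(1/\lambda)Y^{*\prime}C^*Y^* \sim \chi^2(n-k-1)$ and is independent of $z'$,
the ratio

$$\frac{z'}{\left(\dfrac{Y^{*\prime}C^*Y^*}{\lambda(n-k-1)}\right)^{1/2}}
 = \frac{z}{\left(\dfrac{n}{n+1}\left(1 + \dfrac{n}{n+1}\,x^{*\prime}_{n+1} S^{*-1} x^*_{n+1}\right)\dfrac{Y^{*\prime}C^*Y^*}{n-k-1}\right)^{1/2}}$$

follows a Student t distribution with $n-k-1$ degrees of freedom.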





Proof of the independence of $Y^{*\prime}C^*Y^*$ and $z$.

$Y^{*\prime}C^*Y^*$ and $z = y^*_{n+1} - \hat{y}^*_{n+1}$ can be expressed in terms of
$Y' = (y_1, y_2, \ldots, y_{n+1})$ as follows:
ec y= vt( EC) E)o* (1-4 /n) BE! OY = ¥'D'C"py 


2=F n+ I n+1 
mn nti )( (=~ eee 1)=x"" stot * (Tan) E 
ne aN =—n+1— - fea a 


=(n/n+1)) BY - 


Then it can be seen that $BVD'C^*D = 0$, where $V = \tfrac{1}{2}(A + A')
+ \lambda(I - E)$ is an $(n+1) \times (n+1)$ matrix.

Therefore, by Th. 4, they are independent. 
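Both the variance of $z$ and the independence claim can be checked exactly on a small case. The sketch below is an illustration under explicit assumptions that are not taken from the text: one regressor ($k = 1$); $z$ is written as $w'Y$ with a weight vector `w` derived from the definitions of $y^*_{n+1}$ and $\hat{y}^*_{n+1}$; the residual sum of squares is written as $Y'D'CDY$ with $D = [I_n, 0]$ selecting the first $n$ observations and $C$ the residual projector of the fit with intercept; and all numerical values are hypothetical.

```python
from fractions import Fraction

def build_v(a, lam):
    """V = (1/2)(A + A') + lam * (I - E); row i of A is the constant a[i]."""
    m = len(a)
    return [[Fraction(a[i] + a[j], 2) + lam * ((1 if i == j else 0) - 1)
             for j in range(m)] for i in range(m)]

x = [1, 2, 4, 7]          # regressor values of the n given observations
x_new = 5                 # regressor value of the future observation
a = [3, 4, 6, 5, 8]       # hypothetical a_1, ..., a_{n+1}
lam = Fraction(2)         # hypothetical lambda
n = len(x)
f = Fraction(n, n + 1)

xbar = Fraction(sum(x), n)
xs = [xi - xbar for xi in x]          # X* (single column)
s_star = sum(v * v for v in xs)       # S*
xs_new = x_new - xbar                 # x*_{n+1}

# z = w'Y, where Y = (y_1, ..., y_{n+1})'
w = [f * (-Fraction(1, n) - xs_new * v / s_star) for v in xs] + [f]

V = build_v(a, lam)                   # (n+1) x (n+1)
wV = [sum(w[i] * V[i][j] for i in range(n + 1)) for j in range(n + 1)]
var_z = sum(wV[j] * w[j] for j in range(n + 1))
claimed = lam * f * (1 + f * xs_new ** 2 / s_star)

# C = I - E/n - X* S*^{-1} X*' acts on the first n observations only
C = [[(1 if i == j else 0) - Fraction(1, n) - xs[i] * xs[j] / s_star
      for j in range(n)] for i in range(n)]
cross = [sum(wV[i] * C[i][j] for i in range(n)) for j in range(n)]  # w'V D'C

print(var_z == claimed, all(v == 0 for v in cross))
```

The weights `w` sum to zero, which is why the $a_i$ drop out of the variance of $z$; the zero cross-product is the condition required by Theorem 4.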

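Consequently, a two-sided $100\gamma\%$ prediction interval to
contain the future sample $y^*_{n+1}$ using the regression obtained
from the n given observations is

$$\hat{y}^*_{n+1} \pm t_{(1+\gamma)/2}(n-k-1)
 \left(\left(\frac{n}{n+1} + \left(\frac{n}{n+1}\right)^2 x^{*\prime}_{n+1} S^{*-1} x^*_{n+1}\right)\frac{Y^{*\prime}C^*Y^*}{n-k-1}\right)^{1/2}.$$

Finally, the lower limit, L, and the upper limit, U, for
$y_{n+1}$ are expressed as follows:

$$L = \frac{n+1}{n}\left(\hat{y}^*_{n+1} - t_{(1+\gamma)/2}(n-k-1)
 \left(\left(\frac{n}{n+1} + \left(\frac{n}{n+1}\right)^2 x^{*\prime}_{n+1} S^{*-1} x^*_{n+1}\right)\frac{Y^{*\prime}C^*Y^*}{n-k-1}\right)^{1/2}\right) + \bar{y}_n,$$

$$U = \frac{n+1}{n}\left(\hat{y}^*_{n+1} + t_{(1+\gamma)/2}(n-k-1)
 \left(\left(\frac{n}{n+1} + \left(\frac{n}{n+1}\right)^2 x^{*\prime}_{n+1} S^{*-1} x^*_{n+1}\right)\frac{Y^{*\prime}C^*Y^*}{n-k-1}\right)^{1/2}\right) + \bar{y}_n.$$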